238 research outputs found

    Polyphonic Sound Event Detection by using Capsule Neural Networks

    Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal, such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has recently been introduced in the image processing field to overcome some of the known limitations of CNNs, specifically their limited robustness to affine transformations (i.e., perspective, size, orientation) and the detection of overlapped images. This motivated the authors to employ CapsNets for the polyphonic SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties for each individual sound event. Capsule units are connected through so-called "dynamic routing", which encourages learning part-whole relationships and improves the detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing that the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results with respect to state-of-the-art algorithms.
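
The "dynamic routing" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the routing-by-agreement scheme and the squash nonlinearity from the original CapsNet formulation; the abstract does not confirm that the paper uses exactly this variant, and the names `squash`, `softmax`, `dynamic_routing`, and `u_hat` are hypothetical.

```python
import math

def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0] * len(s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, n_iters=3):
    """Route predictions u_hat[i][j] (lower capsule i -> upper capsule j).

    Assumed scheme: coupling coefficients grow where a prediction agrees
    (large dot product) with the current output of the upper capsule.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(n_iters):
        c = [softmax(row) for row in b]  # couplings of each lower capsule
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v

# Two lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
u_hat = [[[1.0, 0.0], [0.0, 0.5]],
         [[0.9, 0.1], [0.0, -0.5]]]
outputs = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in outputs]
```

In this toy input, the upper capsule that receives agreeing predictions ends up with the longer output vector. In a polyphonic setting, one capsule per sound class could in principle be read out this way, with vector length acting as the activity score.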

    Classification of bearing faults through time-frequency analysis and image processing

    The present work proposes a new technique for bearing fault classification that combines time-frequency analysis with image processing. This technique uses vibration signals from bearing housings to detect bearing conditions and classify the faults. By means of Empirical Mode Decomposition (EMD), each vibration signal is decomposed into Intrinsic Mode Functions (IMFs). Principal Components Analysis (PCA) is then performed on the matrix of the decomposed IMFs and the important principal components are chosen. The spectrogram of each component is computed by means of the Short Time Fourier Transform (STFT), yielding an image that represents the time-frequency relationship of the main components of the analyzed signal. Image Moments are then extracted from the spectrogram images of the principal components in order to obtain an array of features for each signal that can be handled by the classification algorithm. Eight images are selected for each signal and 17 moments for each image, so an array of 136 features is associated with every signal. Finally, the classification is performed using a standard machine learning technique, i.e. a Support Vector Machine (SVM). The dataset used in this work includes data collected by a Roller Bearing Faults Simulator at various rotating speeds and loads, in order to cover a set of different operating conditions. The results show that the developed technique, with a single classifier, effectively classifies bearing faults characterized by different rotating speeds and different loads.
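
The feature-count arithmetic above (8 images times 17 moments gives 136 features) can be sketched as follows. This is only an illustration: the abstract does not say which 17 image moments are used, so `MOMENT_ORDERS` is a hypothetical placeholder set of raw moments, and the tiny grids stand in for real STFT spectrogram images.

```python
def raw_moment(image, p, q):
    """Raw image moment M_pq: sum over pixels of x**p * y**q * intensity."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )

# Hypothetical choice of 17 moment orders (an assumption, not from the paper):
# the 15 orders with p + q <= 4, plus two fifth-order moments.
MOMENT_ORDERS = [(p, q) for p in range(5) for q in range(5) if p + q <= 4]
MOMENT_ORDERS += [(5, 0), (0, 5)]

def feature_vector(images, orders=MOMENT_ORDERS):
    """Concatenate the selected moments of every image into one flat list."""
    return [raw_moment(img, p, q) for img in images for (p, q) in orders]

# Eight placeholder 2x3 "spectrograms", one per selected principal component.
images = [[[k + 1, 0, 2], [0, 1, k]] for k in range(8)]
features = feature_vector(images)
print(len(features))  # 136
```

A vector of this kind, one per vibration signal, is what a classifier such as an SVM would receive as input.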

    Exploiting heterogeneous data for the estimation of particles size distribution in industrial plants

    In industrial environments, it is often difficult and expensive to collect enough data to adequately train expert systems for regression purposes. Therefore, the use of already available data from environments with similar characteristics could be an effective way to balance regression performance against the amount of data to gather for training. In this paper, the authors propose two alternative strategies for improving the regression performance by using heterogeneous data, i.e. data coming from environments other than the one taken as reference for testing. These strategies are based on a standard machine learning algorithm, i.e. the Artificial Neural Network (ANN). The employed data came from measurements in industrial plants for energy production through the combustion of coal powder. The powder is transported in air within ducts and its size is detected by means of Acoustic Emissions (AE) produced by the impact of powder on the inner surface of the duct. The estimation of powder size distribution from AE signals is the task addressed in this work. Computer simulations show that the proposed strategies achieve a relevant improvement in regression performance with respect to the standard approach, which applies the ANN directly to the dataset of the reference plant.

    A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

    Packet loss is a major cause of voice quality degradation in VoIP transmissions, with serious impact on intelligibility and user experience. This paper describes a system based on a generative adversarial approach, which aims to repair the fragments lost during the transmission of audio streams. Inspired by the powerful image-to-image translation capability of Generative Adversarial Networks (GANs), we propose bin2bin, an improved pix2pix framework that translates magnitude spectrograms of audio frames with lost packets into non-corrupted speech spectrograms. In order to better maintain the structural information after spectrogram translation, this paper introduces the combination of two STFT-based loss functions, mixed with the traditional GAN objective. Furthermore, we employ a modified PatchGAN structure as discriminator and we lower the concealment time by a proper initialization of the phase reconstruction algorithm. Experimental results show that the proposed method has obvious advantages when compared with the current state-of-the-art methods, as it can better handle both high packet loss rates and large gaps. Comment: Accepted at EUSIPCO - 31st European Signal Processing Conference, 202

    Learning to Rank Microphones for Distant Speech Recognition

    Fully exploiting ad-hoc microphone networks for distant speech recognition is still an open issue. Empirical evidence shows that being able to select the best microphone leads to significant improvements in recognition without any additional effort on front-end processing. Current channel selection techniques rely on either signal-, decoder-, or posterior-based features. Signal-based features are inexpensive to compute but do not always correlate with recognition performance. Decoder- and posterior-based features, instead, exhibit better correlation but require substantial computational resources. In this work, we tackle the channel selection problem by proposing MicRank, a learning-to-rank framework where a neural network is trained to rank the available channels directly using the recognition performance on the training set. The proposed approach is agnostic with respect to the array geometry and the type of recognition back-end. We investigate different learning-to-rank strategies using a purpose-built synthetic dataset and the CHiME-6 data. Results show that the proposed approach considerably improves over previous selection techniques, reaching comparable and in some instances better performance than oracle signal-based measures.

    Multi-household energy management in a smart neighborhood in the presence of uncertainties and electric vehicles

    The pathway toward the reduction of greenhouse gas emissions is dependent upon increasing Renewable Energy Sources (RESs), demand response, and electrification of public and private transportation. Energy management techniques are necessary to coordinate the operation in this complex scenario, and in recent years several works have appeared in the literature on this topic. This paper presents a study on multi-household energy management for Smart Neighborhoods integrating RESs and electric vehicles participating in Vehicle-to-Home (V2H) and Vehicle-to-Neighborhood (V2N) programs. The Smart Neighborhood comprises multiple households, a parking lot with public charging stations, and an aggregator that coordinates energy transactions using a Multi-Household Energy Manager (MH-EM). The MH-EM jointly maximizes the profits of the aggregator and the households by using the augmented ɛ-constraint approach. The generated Pareto optimal solutions allow for different decision policies to balance the aggregator’s and households’ profits, prioritizing one of them or the RES energy usage within the Smart Neighborhood. The experiments have been conducted over an entire year considering uncertainties related to the energy price, electric vehicle usage, energy production of RESs, and energy demand of the households. The results show that the MH-EM optimizes the Smart Neighborhood operation and that the solution that maximizes the RES energy usage also provides the greatest benefits in terms of peak-shaving and valley-filling capability of the energy demand. Authors: Luca Serafini, Emanuele Principi, Susanna Spinsante, Stefano Squartini
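
For orientation, a generic two-objective form of the augmented ɛ-constraint method is sketched below. It is not taken from the paper: the symbols are assumptions introduced here, with $f_A$ the aggregator's profit, $f_H$ the households' profit, $X$ the feasible set of energy schedules, $s$ a slack variable, $r_H$ the range of $f_H$ over the Pareto set, and $\delta$ a small positive constant. Which objective the authors keep in the objective function and which they move to the constraint is also an assumption.

```latex
\begin{aligned}
\max_{x \in X,\; s \ge 0} \quad & f_A(x) + \delta \, \frac{s}{r_H} \\
\text{s.t.} \quad & f_H(x) - s = \varepsilon
\end{aligned}
```

Solving this problem for a grid of $\varepsilon$ values spanning the range of $f_H$ yields a set of Pareto optimal trade-offs. The small slack term is what distinguishes the augmented variant from the plain ɛ-constraint method, since it steers the solver away from weakly efficient solutions.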

    Genome sequence of Enterococcus mundtii EM01, isolated from Bombyx mori midgut and responsible for flacherie disease in silkworms reared on an artificial diet

    The whole genome sequence of Enterococcus mundtii strain EM01 is reported here. The isolate proved to be the cause of flacherie in Bombyx mori. To date, the genomes of 11 other E. mundtii strains have been sequenced. EM01 is the only strain that displayed active pathological effects on its associated animal species.